[OSDC.fr 2013] In Memory Big Data Analytics with the Apache Spark Ecosystem

2014-02-18

OSDC.fr 2013 — Sam Bessalah
» http://osdc.fr/2013/talk/5038
» http://act.osdc.fr/osdc2013fr/slides/SamBessalah-Spark.pptx

The big data ecosystem has evolved into a myriad of projects, most of them revolving around Hadoop. The Spark project, first designed as an extension to Hadoop, has turned into a full-blown ecosystem: lightning-fast computation for iterative algorithms on large datasets, an integrated graph processing framework, an in-memory filesystem, a stream processing engine, and a new machine learning library that holds the promise of revolutionizing the way we do complex analysis of large data sets.
In this talk we will go through the different components of the Spark analytics stack and see why so many people have taken a strong interest in it.